AMENDMENTS TO THE CLAIMS:

This listing of claims will replace all prior versions, 
and listings, of claims in the application: 

LISTING OF CLAIMS:

1-21. (canceled) 

22. (currently amended) A method for gene mapping to
locate a gene associated with a certain phenotype from a dataset
of chromosome and phenotype data [[from a database, comprising
analysing linkage disequilibrium between]] by analyzing an
association between phenotype and genetic [[marks]] markers m_i,
comprising:

i) searching from [[the data]] said dataset for all marker patterns P
that satisfy a pattern evaluation function e(P), wherein

a: the marker patterns are expressions within [[the
database]] said dataset comprising genetic markers and their
alleles and zero or more of the following: individual
covariates, environmental variables and auxiliary phenotypes; and

b: the pattern evaluation function e(P) is a measure 
of the association between the marker pattern P and a phenotype 
being studied, 

ii) scoring each marker m_i of the data with a marker score s(m_i),
which is a function of the set S_i defined as the set of marker
patterns overlapping the marker m_i and satisfying the pattern
evaluation function e as defined in step i), and

iii) locating said gene to the marker m_i having the best score
s(m_i), wherein the best score is the highest obtained score if
said mapping the location of a gene by evaluating the scores
s(m_i) of all the markers m_i in the data which is determined by
maximizing the score if [[the]] said scoring function is designed
to give higher scores closer to the gene, [[and on minimising the
score if the scoring function is designed to give lower scores
closer to the gene]] or locating said gene to a chromosomal region
containing a set of best scoring markers.

23. (previously presented) The method of claim 22, 
wherein the chromosome data consists of either haplotypes or 
genotypes . 

24. (previously presented) The method of claim 23, 
wherein said haplotypes and genotypes contain flexible regions. 

25-29. (canceled) 

30. (currently amended) The method of claim 22, 

wherein 

a) the phenotype being studied is qualitative, and 

b) the pattern evaluation function e(P) has the form e(P) =
true if and only if e'(P) > x, where e'(P) is the [[(signed)]]
signed association measure χ² and x is a user-specified
minimum value, wherein said signed value of the χ² is
negative if the relative frequency of the haplotype pattern
among the control chromosomes is higher than that of the
trait-associated chromosomes, and otherwise positive, and

c) the score s(m_i) of marker m_i is the size of S_i, also
called marker-wise pattern frequency of m_i and denoted by
f(m_i).
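
Illustrative note (not part of the claim language): the signed χ² measure and the marker-wise pattern frequency of claim 30 might be sketched as follows. This is a minimal sketch under stated assumptions: the 2x2 count layout, the function names `signed_chi2` and `marker_frequencies`, and the representation of a pattern as an inclusive range of marker indices are all hypothetical choices made for the example, and returning 0.0 for a degenerate table is likewise an assumption.

```python
def signed_chi2(a_with, a_without, c_with, c_without):
    """Signed chi-square for a 2x2 table (hypothetical layout).

    a_with / a_without: trait-associated chromosomes with / without the pattern.
    c_with / c_without: control chromosomes with / without the pattern.
    """
    n_a = a_with + a_without
    n_c = c_with + c_without
    n_with = a_with + c_with
    n_without = a_without + c_without
    n = n_a + n_c
    denom = n_a * n_c * n_with * n_without
    if denom == 0:
        return 0.0  # assumption: a degenerate table carries no evidence
    chi2 = n * (a_with * c_without - a_without * c_with) ** 2 / denom
    # Negative when the pattern is relatively more frequent among controls.
    if c_with / n_c > a_with / n_a:
        return -chi2
    return chi2


def marker_frequencies(num_markers, accepted_patterns):
    """f(m_i) = |S_i|: count of accepted patterns overlapping marker i.

    Assumption: each pattern is an inclusive (first, last) marker index
    range and overlaps every marker inside that range.
    """
    f = [0] * num_markers
    for first, last in accepted_patterns:
        for i in range(first, last + 1):
            f[i] += 1
    return f


freqs = marker_frequencies(5, [(0, 2), (1, 3), (2, 2)])
best_marker = max(range(len(freqs)), key=freqs.__getitem__)
```

In this toy run the third marker (index 2) is overlapped by all three accepted patterns, so it receives the highest score.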

31. (previously presented) The method of claim 22, 

wherein 

a) the pattern evaluation function e(P) has the form 
e(P) = true if and only if e'(P) > x, where e'(P) is the 
absolute frequency of pattern P in the data and x is a user- 
specified value, 

b) in order to derive the score s(m_i), the p value
(statistical significance) of each marker pattern P in
determining the phenotype being studied is evaluated, and

c) the score s(m_i) is the distance between the
observed p value distribution of patterns in S_i and the
uniform distribution, defined as average of (p_i - q_i) log(p_i
/ q_i) over all i = 1..n, where n is the number of haplotype
patterns in S_i, p_i is the ith smallest p value in S_i, and q_i is
the expectation of the ith smallest p value, if the p values
were randomly drawn from the uniform distribution.
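
Illustrative note (not part of the claim language): the distance of step c) might be computed as in the sketch below. It assumes that the expectation of the ith smallest of n independent uniform(0, 1) draws is taken as i / (n + 1), that all p values are strictly positive, and that an empty set scores 0.0; the function name `uniform_distance_score` is hypothetical.

```python
import math


def uniform_distance_score(p_values):
    """Average of (p_i - q_i) * log(p_i / q_i) over the sorted p values.

    q_i = i / (n + 1) is the expected i-th smallest value of n draws
    from the uniform distribution on (0, 1).
    """
    n = len(p_values)
    if n == 0:
        return 0.0  # assumption: no patterns means no deviation
    total = 0.0
    for i, p in enumerate(sorted(p_values), start=1):
        q = i / (n + 1)
        total += (p - q) * math.log(p / q)  # requires p > 0
    return total / n


uniform_like = uniform_distance_score([1 / 3, 2 / 3])
skewed = uniform_distance_score([0.001, 0.002])
```

Each term is non-negative, so the score is zero only when the observed p values coincide with their uniform expectations and grows as they deviate.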

32. (previously presented) The method of claim 31,
where the p value is computed using a linear model of form
Y = β_1 X_1 + ... + β_k X_k + αZ + β_0, where the dependent variable Y is the
phenotype being studied, X_1 through X_k are covariates, and Z is a
dummy variable for the occurrence of the haplotype pattern, and
the coefficients α and β_i are adjusted for best fit, and then
the significance of Z as a covariate is assessed using a t test
with the null hypothesis "α = 0".

33. (previously presented) The method of claim 22,
further refining each score s(m_i) by replacing it by the marker-wise
p value of the score s(m_i), where the statistical
significance of s(m_i) is measured against the null hypothesis that
there is no gene effect.

34. (previously presented) The method of claim 22, 
wherein an area returned from a prediction of a gene location is 
contiguous or fragmented or a point . 

35-36. (canceled)

37. (previously presented) The method of claim 22,
wherein the location of the gene, predicted as a function of the
scores s(m_i) and based on maximizing or minimizing the score, is
determined by evaluating the marker scores or by visualization.

38. (canceled) 

39. (previously presented) A computer-readable data
storage medium having computer-executable program code stored
thereon operative to perform the method of claim 22 when executed
on a computer.

40. (previously presented) A computer system having 
executable program code that performs the method of claim 22. 

41. (new) The method of claim 22, comprising 
searching patterns P by the following algorithm: 

Input : 

• set U of marker patterns 

• evaluation function e(P) for patterns P in U 

• generalization relation < for patterns in U, where function e
and relation < are such that if e(P) is true and P' < P, then
e(P') is also true

Output:

• set S of patterns P in U satisfying e(P)

Definition:

• function Lss: U -> 2^U, Lss(P) = { P' ∈ U | P < P' and P' ≠ P
and there is no P'' such that P'' ≠ P and P'' ≠ P' and P < P''
< P' } is the set of least special specializations of pattern
P.

Method: 

Initialize set S and set E of evaluated patterns to be 
empty sets. 

Let Gen be the set of patterns P in U, for which there are 
no patterns P' in U : P' < P. 

Recursively evaluate each pattern P in Gen in depth-first
order.

Method for recursively evaluating pattern P:
Insert P into set E.
If e(P) is true, then {

    Insert P into set S.

    Find set Spec = { P' ∈ Lss(P) | P' ∉ E }.
    Recursively evaluate each pattern in Spec.

} 
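
Illustrative note (not part of the claim language): the depth-first search of claim 41 might be sketched as below. The sketch assumes hashable patterns and caller-supplied `lss` and `e` functions; the toy universe (patterns as item sets, generalization as the subset relation, e(P) as a minimum row count) is entirely hypothetical and chosen only because it satisfies the stated property that e(P) true and P' < P imply e(P') true. One small variation is labeled in the comments: membership in E is checked at the moment each specialization is visited.

```python
def depth_first_search(most_general, lss, e):
    """Collect all patterns satisfying e, pruning below failing patterns."""
    S, E = set(), set()

    def evaluate(p):
        E.add(p)
        if e(p):
            S.add(p)
            for q in lss(p):
                # Variation: test "q not in E" when q is visited, so a
                # pattern reached through a sibling is not evaluated twice.
                if q not in E:
                    evaluate(q)

    for p in most_general:
        if p not in E:
            evaluate(p)
    return S


# Hypothetical toy universe: patterns are frozensets of items.
rows = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]
items = {"a", "b", "c"}


def toy_lss(p):
    # Least special specializations: add exactly one item.
    return [p | {x} for x in sorted(items - p)]


def toy_e(p):
    # True when at least two rows contain every item of the pattern.
    return sum(1 for r in rows if p <= r) >= 2


result = depth_first_search([frozenset()], toy_lss, toy_e)
```

Because the evaluation function is monotone with respect to generalization, no specialization of a failing pattern needs to be visited.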






Docket No. 3502-1038 
Appln. No. 09/875,935 

42. (new) The method of claim 22, comprising 

searching patterns P by the following algorithm: 

Input : 

• set U of marker patterns 

• evaluation function e(P) for patterns P in U 

• frequency threshold x 

• generalization relation < for patterns in U, where relation < 
is such that if P' < P, then occurrence of pattern P implies 
occurrence of pattern P'. 

Output : 

• set S = {P ∈ U | e(P) and ae(P) are true} of patterns, where
ae(P) is true if and only if the frequency of pattern P
exceeds a given threshold x

Definition:

• function Lss: U -> 2^U, Lss(P) = { P' ∈ U | P < P' and P' ≠ P
and there is no P'' such that P'' ≠ P and P'' ≠ P' and P < P''
< P' } is the set of least special specializations of pattern
P.

Method:

Initialize set S and set E of evaluated patterns to be
empty sets.

Let Gen be the set of patterns P in U, for which there are
no patterns P' in U : P' < P.

Recursively evaluate each pattern P in Gen in depth-first
order.

Method for recursively evaluating pattern P:
Insert P into set E.
If ae(P) is true, then {

    If e(P) is true, then {

        Insert P into set S.

    }

    Find set Spec = { P' ∈ Lss(P) | P' ∉ E }.
    Recursively evaluate each pattern in Spec.

} 

43. (new) The method of claim 22, comprising 

searching patterns P by the following algorithm: 

Input : 

• marker map M = (m_1, ..., m_k)

• phenotype vector Y = (y_1, ..., y_n), where y_i ∈ {'disease-associated',
'control'} for all i

• n x k haplotype matrix H = {h_ji}, where h_ji is the allele at
marker i in haplotype j

• threshold value x for χ² test for association between
phenotype and pattern

• maximum pattern length l

• maximum number of gaps g

• maximum gap size s

Output : 

• set S of patterns P in U satisfying e(P), where U consists of
marker-allele assignments that adhere to parameters l, g, and
s, and e(P) is true if and only if χ² test on P using
haplotype matrix H and phenotypes Y exceeds given threshold x

Definitions : 

• pattern (p_1, ..., p_k) matches haplotype j, if and only if for
all markers i : p_i = h_ji or p_i = *

• frequency of pattern P is the number of haplotypes in H
matched by P

• a gap in pattern (p_1, ..., p_k) is set {i, ..., j} of markers, i <
j and i > 1 and j < k, for which p_{i-1} ≠ * and p_{j+1} ≠ * and p_n = *
for each n ∈ {i, ..., j}.

Method: 

Initialize set S to be empty set. 




Calculate lower bound lb for pattern frequency:
lb = π_A π x / (π_C π + π_A x), where π_A is the number of
disease-associated haplotypes, π_C is the number of control
haplotypes, and π is π_A + π_C.

Initialize (p_1, ..., p_k) to be empty pattern (*, ..., *).

For each i ∈ {1, ..., k} and for each a ∈ A_i, where A_i is
the set of alleles at marker i: {
    Let p_i := a.

    Recursively evaluate pattern (p_1, ..., p_k) ranging from
    i to i and all its extensions to the right.

    Let p_i := *.

}

Method for recursively evaluating pattern (p_1, ..., p_k) ranging
from i to j:

If χ² statistic for pattern (p_1, ..., p_k) is greater than or
equal to x and p_j ≠ *, then
    insert P into set S.
If j < k and j - i + 1 < l and the frequency of pattern
(p_1, ..., p_k) is greater than or equal to lb, then {

    For each possible allele a at marker j + 1, a ∈ A_{j+1}: {
        Let p_{j+1} := a.

        Recursively evaluate pattern (p_1, ..., p_k) ranging
        from i to j + 1.
    }

    Let p_{j+1} := *.

    If p_j ≠ * and the number of gaps in pattern (p_1, ...,
    p_k) is smaller than g, then
        introduce a new gap starting at marker j + 1, and
        recursively evaluate pattern (p_1, ..., p_k) ranging
        from i to j + 1.

    If p_j = * and the number of adjacent markers j' < j :
    p_j' = * is smaller than s, then
        extend the current gap over marker j + 1, and
        recursively evaluate pattern (p_1, ..., p_k) ranging
        from i to j + 1.

}
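
Illustrative note (not part of the claim language): the lower bound lb of claim 43 can be checked numerically. The sketch below assumes the formula reads lb = π_A π x / (π_C π + π_A x), and it assumes (as an interpretation, not something the claim states) that the bound corresponds to the most favourable 2x2 table, in which every occurrence of the pattern lies on a disease-associated haplotype. The function names are hypothetical.

```python
def frequency_lower_bound(n_assoc, n_control, x):
    """lb = pi_A * pi * x / (pi_C * pi + pi_A * x)."""
    n = n_assoc + n_control
    return n_assoc * n * x / (n_control * n + n_assoc * x)


def best_case_chi2(freq, n_assoc, n_control):
    """Chi-square of a pattern occurring freq times, all on
    disease-associated haplotypes (assumes freq <= n_assoc)."""
    n = n_assoc + n_control
    return n * freq * n_control / (n_assoc * (n - freq))


lb = frequency_lower_bound(50, 50, 10.0)
chi2_at_lb = best_case_chi2(lb, 50, 50)
```

Under that interpretation, a pattern whose frequency is below lb cannot reach the threshold x even in the best case, which is why the recursion stops extending it.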

44. (new) The method of claim 22, comprising 

searching patterns P by the following algorithm: 

Input : 

• set U of marker patterns 

• evaluation function e(P) for patterns P in U

• generalization relation < for patterns in U, where function e
and relation < are such that if e(P) is true and P' < P, then
e(P') is also true

Output : 

• set S of patterns P in U satisfying e(P) 
Definitions : 

• function Lgg: U -> 2^U, Lgg(P) = { P' ∈ U | P > P' and P' ≠ P
and there is no P'' such that P'' ≠ P and P'' ≠ P' and P > P''
> P' } is the set of least general generalizations of pattern
P.

• function Lss: U -> 2^U, Lss(P) = { P' ∈ U | P < P' and P' ≠ P
and there is no P'' such that P'' ≠ P and P'' ≠ P' and P < P''
< P' } is the set of least special specializations of pattern
P.

Method: 

Initialize set S and set Q to be empty sets.
Initialize set F to contain patterns P in U, for which
there are no patterns P' in U : P' < P.

Repeat the following steps while F is a non-empty set:

For each P ∈ F: {

    if e(P) = true, then insert P into set S,
    else remove P from set F.
}

Let Q := Q ∪ F.

Initialize set C to be empty set.
For each P ∈ F:

    Let C := C ∪ { P' ∈ U | P' ∈ Lss(P) and for all
    P'' ∈ Lgg(P') : P'' ∈ Q }.

Let F := C.
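
Illustrative note (not part of the claim language): the level-wise search of claim 44 might be sketched as below. The caller-supplied `lss`, `lgg`, and `e` functions and the toy universe (item sets ordered by inclusion, e(P) as a minimum row count) are hypothetical and serve only to exercise the candidate-generation step, in which a specialization is kept only if all of its least general generalizations have already been accepted into Q.

```python
def levelwise_search(most_general, lss, lgg, e):
    """Breadth-first search over patterns, one generality level at a time."""
    S, Q = set(), set()
    F = set(most_general)
    while F:
        F = {p for p in F if e(p)}  # drop patterns that fail e
        S |= F
        Q |= F
        C = set()
        for p in F:
            for q in lss(p):
                # Keep q only if every least general generalization is in Q.
                if all(g in Q for g in lgg(q)):
                    C.add(q)
        F = C
    return S


# Hypothetical toy universe: patterns are frozensets of items.
rows = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]
items = {"a", "b", "c"}


def toy_lss(p):
    return [p | {x} for x in sorted(items - p)]


def toy_lgg(p):
    return [p - {x} for x in sorted(p)]


def toy_e(p):
    return sum(1 for r in rows if p <= r) >= 2


result = levelwise_search([frozenset()], toy_lss, toy_lgg, toy_e)
```

In the toy run the candidate {a, b, c} is never evaluated, because its generalization {b, c} failed e at the previous level.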

45. (new) The method of claim 22, wherein marker 
patterns P are searched by the following algorithm: 

Input : 

• set U of marker patterns 

• evaluation function e(P) for patterns P in U

• frequency threshold x 

• generalization relation < for patterns in U, where relation <
is such that if P' < P, then occurrence of pattern P implies
occurrence of pattern P'.

Output:

• set S = {P ∈ U | e(P) and ae(P) are true} of patterns, where
ae(P) is true if and only if the frequency of pattern P
exceeds a given threshold x

Definitions:

• function Lgg: U -> 2^U, Lgg(P) = { P' ∈ U | P > P' and P' ≠ P
and there is no P'' such that P'' ≠ P and P'' ≠ P' and P > P''
> P' } is the set of least general generalizations of pattern
P.

• function Lss: U -> 2^U, Lss(P) = { P' ∈ U | P < P' and P' ≠ P
and there is no P'' such that P'' ≠ P and P'' ≠ P' and P < P''
< P' } is the set of least special specializations of pattern
P.

Method: 

Initialize set S and set Q to be empty sets.
Initialize set F to contain patterns P in U, for which
there are no patterns P' in U : P' < P.

Repeat the following steps while F is a non-empty set:
For each P ∈ F: {

    if ae(P) = true, then {

        if e(P) is true, then insert P into set S.

    }

    else remove P from set F.

}

Let Q := Q ∪ F.

Initialize set C to be empty set.
For each P ∈ F:
    Let C := C ∪ { P' ∈ U | P' ∈ Lss(P) and for all
    P'' ∈ Lgg(P') : P'' ∈ Q }.

Let F := C.

46. (new) The method of claim 22, wherein the location
of the gene, predicted as a function of the scores s(m_i) and
based on maximizing or minimizing the score, is predicted by the
combination of most probable intervals for containing a
trait-susceptibility locus that covers a proportion t ranging from 0 to
100 % of the region covered by markers m_i obtained by taking all
such points in said region whose nearest marker is within the k
best scoring markers, and wherein k is selected so that the
resulting area has length at most t times the length of said
region.

47. (new) The method of claim 22, wherein the location
of the gene, predicted as a function of the scores s(m_i) and
based on maximizing or minimizing the score, is predicted to
those points in the region covered by markers m_i whose nearest
marker scores ...

48. (new) The method of claim 22, comprising searching 
for multiple genes using patterns comprising several components 
referring to different potential gene loci, and scoring marker 
tuples instead of plain markers . 



